In silico prediction and segregation analysis of putative virus defense genes based on SSR markers in sweet potato F1 progenies of cultivars ‘New Kawogo’ and ‘Resisto’

In sweet potato, an anti-virus defense mechanism termed reversion has been postulated to render once-infected plants virus free. The objectives of this study were to identify anti-virus defense genes and evaluate their segregation in progenies. Reference genes from different plant species were used to assemble transcript sequences of each sweet potato defense gene in silico. The sequences were used to evaluate phylogenetic relationships with similar genes from different plant species, to mine the respective defense genes, and thereafter to develop simple sequence repeats (SSRs) for segregation analysis. Eight potential defense genes were identified: RNA dependent RNA polymerases 1, 2, 5, and 6; Argonaute 1; and Dicer-like 1, 2, and 4. Identified genes were differentially related to those of other plants and were observed on different chromosomes. The defense genes contained mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide repeat motifs. The SSR markers segregated within progenies in disomic, co-segregating, nullisomic, monosomic, and trisomic modes. These findings indicate the possibility of deriving and utilizing SSRs using published genomic information. Furthermore, given that the SSR markers were derived from known genes on defined chromosomes, this work will contribute to future molecular breeding and development of resistance gene analogs in this economically important crop.


INTRODUCTION
Sweet potato (Ipomoea batatas (L.) Lam.) production is severely affected by virus diseases that cause yield losses of up to 98% in individual plants (Gibson et al., 1998). Sweet potato is propagated vegetatively using vines as planting material, and farmers use vines from their own crop or traded with other farmers (Rachkara et al., 2017) to plant their gardens (Mukasa et al., 2003). The long-standing traditional practice of selecting healthy-looking vines as source material, coupled with the low level of symptom expression in many single virus infections, has led to the maintenance and proliferation of many viral pathogens (Rachkara et al., 2017). However, the expected high levels of viral prevalence throughout Uganda and consequent reduced yields have not materialized. It has been observed that previously infected field-grown plants of East African sweet potato cultivars may become virus free (Adikini et al., 2016); this phenomenon is termed reversion. Similarly, a number of studies have reported that plants previously infected with Sweet potato feathery mottle virus became virus free (Green et al., 1988; Abad and Moyer, 1992; Gibb and Podovan, 1993). Gibson and Kreuze (2014) reviewed reversion in East African sweet potato cultivars and suggested that it is a result of an RNA silencing mechanism triggered in plants by defense genes. The trigger for this plant defense response is the accumulation of viral dsRNA molecules in a replicative form or viral RNA secondary structures. Plant gene products, such as the RNA dependent RNA polymerases (RDRs), are part of the gene silencing machinery that independently synthesizes viral dsRNA in an amplification step for viral small RNA (21 to 24 nt) production. The dsRNAs are processed by Dicer-like (DCL) proteins into small RNAs, which are subsequently incorporated into the RNA-induced silencing complex with an Argonaute (AGO) protein that uses complementary small RNAs to target viral RNA (Baulcombe, 2004; Peragine et al., 2004; Hunter et al., 2016; Leibman et al., 2017).
Thus, identification of putative defense genes involved in gene silencing is important in plant breeding for the management of virus diseases.
One of the methods of studying the inheritance of virus resistance is genetic analysis (Mwanga et al., 2002). In this regard, simple sequence repeats (SSRs) are genetic markers that have received particular attention because they are highly informative, codominant, multi-allelic, experimentally reproducible, and transferable among related species (Mason, 2015). SSRs are used for various purposes, including studies of diversity measured on the basis of genetic distance, evolutionary studies, construction of linkage maps, mapping of loci involved in quantitative traits, estimation of the degree of kinship between genotypes, marker-assisted selection, definition of cultivar DNA fingerprints, and estimation of gene flow (segregation) in populations (Vieira et al., 2016).
Therefore, this study aimed to identify potential defense genes that may be responsible for reversion from virus infection, and to evaluate their segregation patterns using SSR markers.

*Corresponding author. E-mail: wasswa@caes.mak.ac.ug. Tel: +256(0)782762081.
Author(s) agree that this article remain permanently open access under the terms of the Creative Commons Attribution License 4.0 International License.

MATERIALS AND METHODS
Plant materials
Sweet potato cultivars "New Kawogo" and "Resisto", sourced from virus-free sweet potato collections at the Makerere University Agricultural Research Institute (MUARIK) and Namulonge Crops Resources Research Institute, respectively, were used. The white-fleshed "New Kawogo" is Ugandan in origin and is virus resistant (Gasura and Mukasa, 2010; Mwanga et al., 2016), while the orange-fleshed "Resisto" from the USA is virus susceptible (Mwanga and Ssemakula, 2011). These cultivars were used as parents in a full diallel cross, with reciprocals considered (Griffing, 1956). The resulting seeds were harvested and planted in pots containing sterile potting mix, which were then placed in an insect-proof screenhouse at MUARIK. Imidacloprid and cypermethrin were applied weekly to control whitefly and aphid vectors of viruses. Each germinated seed was considered a progeny and was grown for 2 months prior to propagation using cuttings, which were subsequently established in pots in an insect-proof screenhouse at MUARIK.

In silico prediction of defense genes
Defense gene transcript sequences from different plant species were obtained from the National Center for Biotechnology Information (NCBI) (www.ncbi.nlm.nih.gov) using the Basic Local Alignment Search Tool (BLASTn) (Altschul et al., 1997), and used as references to derive similar gene sequences for sweet potato. The functions of the reference transcript sequences were verified using the Kyoto Encyclopedia of Genes and Genomes (www.genome.jp/kegg/), the Sol Genomics Network (solgenomics.net), and Phytozome (phytozome.net). The reference sequences were used as BLASTn queries on the Sweet potato genomics resource website (sweetpotato.plantbiology.msu.edu); this process identified homologous sequences within the genomes of the sweet potato relatives Ipomoea trifida and Ipomoea triloba (Wu et al., 2018). These homologous sequences were then used as templates to run a local BLASTn search within a database created in CLC Genomics Workbench software, which was loaded with a text (Notepad) file of transcript data and chromosomal locations sourced from the Sweet potato genome website (public-genomes-ngs.molgen.mpg.de/SweetPotato/).
Local BLASTn searches were conducted twice during in silico evaluation. The first used high stringency parameters, with the expectation value set at E-10; transcripts derived at this stringency level were named according to the number of hits and the level of homology. The second search used a low level of stringency, with the expectation value set at E-6, and names were assigned as before. This process revealed partial potential sweet potato virus defense gene transcripts and their respective chromosomal locations.
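The two-pass stringency filter described above can be sketched as follows. This is a minimal illustration, not the CLC workflow used in the study: it assumes that hits are available as BLAST tabular output (12 tab-separated columns with the E-value in column 11), that "E-10" and "E-6" mean 1e-10 and 1e-6, and the three hits shown are invented for the example.

```python
import csv
import io

# Hypothetical BLASTn tabular output (12 tab-separated columns, E-value in
# column 11). These three hits are invented purely for illustration.
EXAMPLE_HITS = """\
ref_RDR1\ttranscript_001\t92.5\t800\t60\t0\t1\t800\t10\t809\t1e-150\t900
ref_RDR1\ttranscript_002\t81.0\t300\t57\t2\t5\t304\t40\t339\t3e-08\t120
ref_RDR1\ttranscript_003\t78.0\t90\t20\t1\t1\t90\t5\t94\t0.002\t45
"""

HIGH_STRINGENCY = 1e-10  # assumed reading of "E-10" in the text
LOW_STRINGENCY = 1e-6    # assumed reading of "E-6" in the text


def filter_hits(tabular_text, max_evalue):
    """Return (query, subject, evalue) for hits with E-value <= max_evalue."""
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    kept = []
    for row in reader:
        if len(row) < 12:
            continue  # skip blank or malformed lines
        evalue = float(row[10])
        if evalue <= max_evalue:
            kept.append((row[0], row[1], evalue))
    return kept


high = filter_hits(EXAMPLE_HITS, HIGH_STRINGENCY)
low = filter_hits(EXAMPLE_HITS, LOW_STRINGENCY)
print("high stringency:", [subject for _, subject, _ in high])
print("low stringency:", [subject for _, subject, _ in low])
```

Every hit that passes the high stringency threshold also passes the low one, so the second pass can only add candidates, which is consistent with its use for recovering additional variants.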
Further, the evolutionary relationship of each defense gene was estimated. This was done using the derived sweet potato virus defense gene transcripts and homologous gene transcripts (of different plant species) sourced from NCBI and sweetpotato.plantbiology.msu.edu.
Phylogenetic trees were constructed using the maximum likelihood method under the Jukes and Cantor (1969) model in the CLC workbench. Observations were validated using Unipro UGENE software (Okonechnikov et al., 2012). Sequences used for rooting the phylogenetic trees were selected randomly.
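For orientation, the Jukes and Cantor (1969) model assumes equal base frequencies and equal substitution rates, which gives the pairwise distance d = -(3/4) ln(1 - (4/3) p), where p is the proportion of differing sites. The sketch below computes only this pairwise distance for two pre-aligned sequences; it is not the maximum likelihood tree search performed in the CLC workbench, and the function name and gap handling are assumptions made for illustration.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """JC69 distance between two aligned sequences of equal length.

    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = mismatches / compared
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# One mismatch in ten sites: the corrected distance is slightly above p = 0.1.
d = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA")
print(round(d, 4))
```

The corrected distance exceeds the raw proportion of differences because the model accounts for multiple substitutions at the same site.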
Partial transcript sequences of sweet potato were also used as templates for mining full DNA sequences by BLASTn searching of the sweet potato genome (public-genomes-ngs.molgen.mpg.de/SweetPotato/; Yang et al., 2017). The genomic DNA sequences mined in this way, on their respective chromosomes, were screened for coding and non-coding regions using the MUSCLE (Edgar, 2004) sequence alignment program on the Unipro UGENE platform (Okonechnikov et al., 2012). These regions were verified using the online tools GENSCAN and GenomeScan, and the CLC Genomics Workbench.

Simple sequence repeats mining
DNA sequences within the coding regions were analyzed and screened for simple sequence repeats (SSRs) using WebSat software (wsmartins.net/websat) (Martins et al., 2009). This software was also used to generate SSR-based primers for analysis of segregation of the SSRs in the parental cultivars and their progenies. Outliers (sweet potato cultivars "Ejumula" and "Tanzania" and the sweet potato relative Ipomoea setosa) were included. Previous work has shown that "Ejumula" is susceptible to virus infections (Mwanga et al., 2007), while "Tanzania" is moderately resistant (Gasura and Mukasa, 2010). The virus-sensitive I. setosa is often used during virus diagnostics in sweet potato (Fuentes, 2010).
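The idea behind SSR screening can be illustrated with a short regular-expression scan for perfect tandem repeats of motifs 1 to 6 nt long. This is a sketch only and does not reproduce WebSat: the minimum repeat counts in MIN_REPEATS and the test sequence are assumptions chosen for the example.

```python
import re

# Minimum number of tandem copies per motif length. These thresholds are
# illustrative assumptions, not the WebSat settings used in the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 4, 4: 3, 5: 3, 6: 2}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs."""
    sequence = sequence.upper()
    found = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # Group 2 is the motif; \2{n,} requires at least n further copies.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_count - 1)
        )
        for match in pattern.finditer(sequence):
            motif = match.group(2)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "ATAT" is already reported as the dinucleotide "AT".
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            copies = len(match.group(1)) // motif_len
            found.append((match.start(), motif, copies))
    return sorted(found)


# Hypothetical sequence containing (AGTAAA)2 and (AT)7.
example = "GCGC" + "AGTAAA" * 2 + "TTCG" + "AT" * 7 + "CCG"
for start, motif, copies in find_ssrs(example):
    print(start, "(%s)%d" % (motif, copies))
```

Flanking sequence on either side of each reported repeat would then be used for primer design, which this sketch does not attempt.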

Genomic DNA extraction
Genomic DNA of parents, progeny genotypes, and outliers was isolated using a modified version of the CTAB method (Maruthi et al., 2002). DNA quality was assessed using a NanoDrop ND-1000 spectrophotometer (Thermo Scientific, Bargal Analytical Instruments, Airport City, Israel), and DNA was diluted to 50 ng for downstream analysis. DNA was visualized on a 1% agarose gel: agarose (VWN International) was dissolved in 0.5% Tris-borate-EDTA (TBE) buffer, warmed in a microwave oven, and cooled under running tap water. The agarose was mixed with ethidium bromide (HyLabs, Rehovot, Israel), cast, and allowed to cool and set for 30 to 40 min. Then, 5 µl of diluted genomic DNA was mixed with 5 µl of loading dye (prepared using 0.25% bromophenol blue, 0.25% xylene cyanol, and 30% glycerol), loaded onto the agarose gel, and run in the gel tank for 15 min at 120 V (Clever Scientific, Image Care, Kampala, Uganda).

PCR amplification
Annealing temperature was optimized during amplification of the in silico derived primers using a gradient of eight temperatures on a PCR machine (Clever Scientific, Image Care, Kampala, Uganda). The optimal temperature that amplified polymorphic bands was used for subsequent evaluations. The 10 µl PCR master mix contained 3 µl of water, 5 µl of PCR mix (HyLabs Ready Mix [×2], HyLabs, Rehovot, Israel), 0.5 µl each of forward and reverse primer (10 pmol), and 1 µl of DNA (50 ng). The PCR conditions for SSR amplification were an initial denaturation at 94°C for 4 min, followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 50°C for 45 s, and extension at 72°C for 30 s, with a final extension at 72°C for 10 min. A 3% agarose gel was used for visualization: the agarose was warmed in a microwave oven and cooled, and then ethidium bromide was added and mixed. After setting, the gel was submerged in a gel tank containing 1x TBE buffer. Then, 7 µl of PCR product was mixed with 2 µl of loading dye and loaded into the gel, which was run at 50 V for 60 min.
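The per-reaction volumes above sum to 10 µl (3 + 5 + 0.5 + 0.5 + 1), and scaling them to a batch of reactions is simple arithmetic. The sketch below shows one way to do this; the overage fraction and the choice to add template DNA to each tube separately are assumptions for illustration and are not stated in the protocol.

```python
# Per-reaction volumes (µl) as given in the text.
PER_REACTION_UL = {
    "water": 3.0,
    "2x ready mix": 5.0,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "template DNA": 1.0,
}


def master_mix(n_reactions, overage=0.10):
    """Volumes (µl) of shared reagents for n reactions.

    Template DNA is excluded because it differs per sample. The overage
    (extra fraction to cover pipetting loss) is an assumed convenience.
    """
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in PER_REACTION_UL.items()
        if name != "template DNA"
    }


print(sum(PER_REACTION_UL.values()))   # total volume per reaction
print(master_mix(50, overage=0.0))     # e.g. one marker across 50 progenies
```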

Band scoring
Bands were visualized on a gel documentation system (Clever Scientific, Image Care, Kampala, Uganda), where we first analyzed parental cultivars to evaluate and differentiate homozygous from heterozygous bands, according to Guo et al. (2015). A single band was considered to be homozygous at that locus or marker on the gene, while double bands were considered heterozygous. Therefore, primers that revealed double bands in one or both parents were used for progeny segregation analysis. Homozygous SSRs were also used to validate homozygosity in progenies. Bands were scored as binary presence/absence (1/0) counts.
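The scoring rules in this subsection can be expressed as a small sketch: each expected band position is scored 1 or 0, and a parent is called homozygous or heterozygous from the number of bands present. The band sizes and the size-matching tolerance below are hypothetical; in the study, bands were scored from gel images.

```python
def score_bands(observed_sizes, expected_sizes, tolerance_bp=5):
    """Binary presence/absence (1/0) score for each expected band size.

    tolerance_bp is an assumed matching window, not a value from the study.
    """
    return [
        1 if any(abs(obs - exp) <= tolerance_bp for obs in observed_sizes) else 0
        for exp in expected_sizes
    ]


def parental_zygosity(scores):
    """Single band = homozygous, double bands = heterozygous."""
    n_bands = sum(scores)
    if n_bands == 1:
        return "homozygous"
    if n_bands == 2:
        return "heterozygous"
    return "unclassified"


# Hypothetical band sizes (bp) for one primer pair in the two parents.
expected = [150, 160, 170]
parent_1 = score_bands([152, 170], expected)
parent_2 = score_bands([161], expected)
print(parent_1, parental_zygosity(parent_1))
print(parent_2, parental_zygosity(parent_2))
```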

Segregation evaluation
Segregation was evaluated according to the methods of Zou et al. (2006) and Stift et al. (2008). SSRs that revealed double bands in parental genotypes were considered to be two markers of the same gene; single bands therefore represented one marker of that gene. It was assumed that on crossing two markers (that is, A, a), the progeny followed the Mendelian segregation pattern, in a 1:2:1 ratio (1AA:2Aa:1aa). This was considered disomic inheritance (Zou et al., 2006; Stift et al., 2008) and was revealed as three bands on the gel (Guo et al., 2015). Inheritance mode was classified following Zou et al. (2006), Stift et al. (2008), and Guo et al. (2015), where SSRs that revealed two bands were co-segregating, and zero, one, and four bands were considered as nullisomic, monosomic and tetrasomic inheritance, respectively. Chi-square goodness of fit analysis within XLSTAT (Addinsoft, 2017) was used to test the fit of segregation ratios of SSR markers to the disomic inheritance ratio of 1:2:1 at P ≤ 0.01.
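As a worked illustration of the goodness of fit test, the sketch below computes the chi-square statistic for observed AA:Aa:aa counts against the 1:2:1 expectation. With three classes there are 2 degrees of freedom, for which the P value has the closed form exp(-X²/2), so no statistics library is needed. The band-count mapping follows the wording of this subsection, and the progeny counts are invented; the study itself used XLSTAT.

```python
import math

# Band-count classification as worded in this subsection.
BAND_CLASS = {
    0: "nullisomic",
    1: "monosomic",
    2: "co-segregating",
    3: "disomic",
    4: "tetrasomic",
}


def chi_square_1_2_1(n_AA, n_Aa, n_aa):
    """Chi-square statistic and P value for fit to a 1:2:1 ratio (df = 2)."""
    observed = [n_AA, n_Aa, n_aa]
    total = sum(observed)
    if total == 0:
        raise ValueError("no progeny scored")
    expected = [total * 0.25, total * 0.50, total * 0.25]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.exp(-statistic / 2.0)  # exact survival function for df = 2
    return statistic, p_value


# Hypothetical counts in 50 progeny: one close to 1:2:1, one strongly skewed.
fit_stat, fit_p = chi_square_1_2_1(12, 26, 12)
skew_stat, skew_p = chi_square_1_2_1(30, 15, 5)
print(round(fit_stat, 2), fit_p > 0.01)    # no significant deviation
print(round(skew_stat, 2), skew_p <= 0.01)  # significant deviation
```

At the 0.01 level with 2 degrees of freedom the critical value is about 9.21, so a marker deviates significantly from disomic inheritance when its statistic exceeds that value.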

RESULTS
Defense genes and SSRs
In silico prediction identified eight defense gene families of I. batatas (denoted Ib) (IbRDR1, IbRDR2, IbRDR5, IbRDR6, IbAGO1, IbDCL1, IbDCL2, and IbDCL4), located on 10 chromosomes (Table 1). There were six variants of IbRDR1; two of these (IbRDR1a3 and IbRDR1b1) were not used during segregation analysis because they were highly (98%) homologous, and the other four variants were located on chromosomes 8 and 1 (Table 1). Two variants of IbRDR5 were found, located on chromosomes 14 and 11; two variants of IbDCL1, located on chromosomes 1 and 9; and three variants of IbDCL2, located on chromosomes 12, 13, and 6. There were no variants of the remaining genes (Table 1).

Abundance of SSRs
Mononucleotide, dinucleotide, trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide repeats were detected within the various coding regions of the defense genes. Pentanucleotide repeats were the most abundant (52.04%), while the least abundant were trinucleotide and tetranucleotide repeats (4.09% each) (Table 2). The highest proportion of repeats was observed in IbRDR1a2 (44.67%), while the lowest was in IbRDR6 (0.44%); all forms of repeat were observed in IbRDR1a2 (Table 2).

Phylogenetic relationships of sweet potato defense genes to similar genes in other plant species
The phylogenetic relationships revealed that some putative sweet potato (I. batatas, Ib) defense genes or their variants had evolved recently and are closely related to those of either I. trifida or I. triloba, yet distantly related to those of other plant species.
The relationships of the six IbRDR1 variants varied. IbRDR1a1, IbRDR1a2, and IbRDR1a3 evolved earlier than the other IbRDR1s, though they diverged from I. triloba RDR1 variant 1. IbRDR1b2 and variant IbRDR1a4 appeared to have evolved recently. All the IbRDR1s are related to variants of either I. trifida, I. triloba or Ipomoea nil RDR1 (Appendix Figure 1). The RDR1 variants of other plant species, such as Cucurbita species, Nicotiana species, Hevea brasiliensis, Manihot esculenta and Olea europaea (Appendix Figure 1), are distantly related to I. batatas RDR1 and its variants. Also, according to the phylogram, I. batatas RDR2 recently evolved from I. trifida and I. triloba, though they share a common ancestor. These RDR2 also diverged extensively from I. batatas RDR2 (Appendix Figure 2). When RDR5 of different plants was compared, it was observed that RDR5 of all Ipomoea spp. evolved earlier than the RDR5 of other plant species. IbRDR5b evolved earlier than IbRDR5a. IbRDR5b clustered with I. triloba RDR5 variants, whereas IbRDR5a clustered with those of I. nil and I. trifida (Appendix Figure 3). On the other hand, the RDR6 of Nicotiana and Solanum spp. evolved much earlier than that of Ipomoea spp. Among the Ipomoea spp., I. batatas RDR6 evolved earlier than the RDR6 of I. trifida, I. triloba and I. nil (Appendix Figure 4).
The I. batatas AGO1 and I. trifida AGO1 are closely related and share I. nil AGO1 as a phylogenetic ancestor. Further, whereas the AGO1 of all Ipomoea spp. is related to the AGO1 of Nicotiana and Solanum spp., they diverged earlier from those of fruit trees like Vitis vinifera and Citrus sinensis, among others (Appendix Figure 5). Additionally, with the exception of DCL1 from Solanum spp. and Nicotiana tabacum, the DCL1 of Ipomoea spp. has evolved recently. In particular, I. batatas DCL1b and IbDCL1a evolved earlier than the DCL1 of I. trifida or I. triloba, though they are highly related. The DCL1 of other species sampled (for instance Theobroma cacao, Hevea brasiliensis, and M. esculenta, among others) evolved much earlier than I. batatas DCL1 (Appendix Figure 6).
The phylogram showed that I. batatas DCL2b and IbDCL2c diverged from I. nil DCL2 and its variants. IbDCL2b and IbDCL2c evolved earlier than DCL2 from I. triloba and I. trifida. IbDCL2a has evolved recently, though it is related to those of I. triloba and I. trifida. The DCL2 of plants like Capsicum annuum and Solanum lycopersicum evolved earlier than that of Ipomoea spp. (Appendix Figure 7). The I. batatas DCL4 is closely related to that of I. triloba and has evolved recently. The DCL4 of other plant species evolved much earlier than I. batatas DCL4. The DCL4 of Solanum and Nicotiana spp. evolved much earlier than those of Ipomoea spp. (Appendix Figure 8).

Segregation analysis of defense genes using SSRs
From a total of 222 SSR generated primers, 63 SSR primer sets were used in downstream analysis, from which nine showed heterozygous bands when evaluated in the parent cultivars, and were assumed to represent markers (Table 3).
Table 3 (row fragment): Marker I; IbDCL2c_2; motif (AGTAAA)2; primer sequences GCAAGAATCGAATTTAGTGCTC and TTCCCGAAATGTCTACTGCTAT.
From the nine heterozygous SSRs on the different chromosomes, we identified 449 alleles in the 50 progenies, among which 51.44% segregated monosomically, 37.27% were co-segregated, and 9.3% fitted the expected disomic inheritance model; trisomic and nullisomic segregation was low (1.55 and 0.44%, respectively) (Table 4). There was deviation (P ≤ 0.01) from the disomic inheritance model for segregation of all markers (Table 4).
Inheritance of the defense gene SSRs varied within the progenies, as indicated by the different models of segregation for the markers (Table 4). Inheritance models of markers were found as follows: B, D, and C were disomic, co-segregating, monosomic, and trisomic; E was disomic, co-segregating, nullisomic, and monosomic; F, G, and I were disomic, co-segregating, and monosomic; A was disomic and co-segregating; and H was co-segregating and monosomic. Co-segregation dominated for markers A, D, and C, while monosomic inheritance dominated for markers B, I, and E. Marker A had the highest proportion of co-segregating progenies (96%), while marker E had the lowest proportion of nullisomic progenies (2%) (Table 4).

DISCUSSION
Using in silico predictions from the sweet potato genome (Yang et al., 2017; Wu et al., 2018), we identified putative sweet potato defense genes, their variants and their microsatellites (SSR markers), and evaluated their segregation patterns. This is the first study of SSRs from specific chromosomal locations in genes coding for or involved in virus RNA silencing, and of their segregation as potential virus defense gene markers in sweet potato progenies. Eight putative defense genes were derived using high and low stringency cut-off values; low stringency prediction has previously been used to derive resistance genes in sugarcane (Wanderley-Nogueira et al., 2007) and to identify defense gene variants (Table 1) in I. trifida and I. triloba (Wu et al., 2018), I. nil (Morgulis et al., 2008), and potato (Hunter et al., 2016). The defense genes of sweet potato were phylogenetically related to defense genes in other plants (Appendix Figures 1 to 8). Specifically, there was a close relationship within Ipomoea spp. This is consistent with the report by Feng et al. (2018) on the evolutionary relationship between I. batatas (sweet potato) and its wild relatives I. trifida and I. triloba.

Table 4. Test for progeny segregation of putative virus defense gene SSRs in a population of 50 seed progeny crosses between "New Kawogo" and "Resisto".
In the present study, the detection of microsatellites (SSR markers) within the DNA coding regions of the defense genes indicates an improvement in the understanding of defense genes and virus resistance compared with previous knowledge (Mwanga et al., 2002; Yada et al., 2017). It is important to note that the sweet potato SSRs currently known (Parado, 2010; Wang et al., 2011) are randomly located within the genome and tend to be difficult to develop or study without the use of sophisticated equipment (Schafleitner et al., 2010; Wang et al., 2011); however, the method used in the present study is inexpensive and targeted to specific genes and chromosomes. It was found that pentanucleotide repeats were the most abundant (52.04%), followed by hexanucleotide repeats (20.49%) (Table 2). In contrast, hexanucleotide repeats are most frequent (46.38%) in arum lily (Zantedeschia aethiopica), followed by mononucleotide repeats (31.86%) (Radhika et al., 2011); trinucleotide repeat motifs dominate in citrus and jatropha (Wen et al., 2010); and dinucleotide repeats dominate in potato (Tang et al., 2009). This study is the first to report the presence of all major forms of repeat motif (mononucleotide, dinucleotide, trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide) within IbRDR1, demonstrating a considerable increase in the number of available genetic markers for sweet potato.
Segregation of putative sweet potato defense genes was analyzed in progenies and was found to be disomic, co-segregating, nullisomic, monosomic, and tetrasomic (Table 4), as confirmed by deviations from the expected Mendelian segregation ratio of 1:2:1 (P ≤ 0.01). This deviation may be complex, because no clear pattern of segregation of defense genes was found in sweet potato. In a study of progeny from different parental crosses, Mwanga et al. (2002) reported that resistance genes segregate in both Mendelian and non-Mendelian patterns, and we suggest this may have occurred in the present study (Table 4). The present results are in contrast with those reported by Rukarwa et al. (2013) for segregation of the cry7Aa1 gene for weevil resistance in sweet potato, which segregated in a Mendelian pattern, as would usually be expected for a transgene. When each marker was considered, some progenies inherited genes disomically and fitted well to the Mendelian segregation model (Table 4), indicating almost perfect crossing, whereas other progenies had chromosome doubling (1.55% tetrasomic inheritance) or reduction (0.44% nullisomic inheritance). Interestingly, varied forms of segregation were found within a marker in different progenies (Table 4), indicating that marker segregation in sweet potato may be highly variable among progenies. It is also possible that the allelic composition of a particular defense gene varies among progenies, which could be an underlying factor in the variability of reversion potential in different sweet potato genotypes (Wasswa et al., 2011) and in the variability of disease and pest resistance in potato (Yermishin et al., 2016).
Variable patterns of segregation in progenies may also be attributed to segregation distortion of different genes and chromosomes during crossing, as has been reported for barley (Liu et al., 2011) and coffee (Ky et al., 2000), possibly because the large sweet potato chromosome number (2n=6x=90) may be subject to segregation distortion and a high level of cross incompatibility (Knox and Ellis, 2002; Yamagishi et al., 2010). The breeding of provitamin A-rich orange-fleshed sweet potato with virus resistance is a priority in East Africa (Low et al., 2017). There are, therefore, immediate opportunities for use of this resistance marker gene technique in crop breeding, as demonstrated here, including the crossing of parents with important characteristics (virus-resistant, white-fleshed "New Kawogo" and virus-susceptible, orange-fleshed "Resisto"). The approaches used here may be easily applied to SSR markers for the provitamin A synthetic pathway (Wu et al., 2018) for further development of sweet potato cultivars.

Conclusion
Identification of putative virus resistance genes in the sweet potato genome and development of SSR markers using bioinformatics tools is potentially more efficient than using traditional methods. The SSRs detected in this study may be used in molecular breeding and development of resistance gene analogs, and gene clustering studies of this culturally and economically important crop. This detection of important defense genes in polyploid sweet potato suggests this may be equally possible for other complex genomes, like those of potato and peanut.

APPENDIX
Phylograms showing evolutionary relationships of defense genes of sweet potato to other plant species.